Overview

Dataset statistics

Number of variables: 22
Number of observations: 44322
Missing cells: 82372
Missing cells (%): 8.4%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 51.6 MiB
Average record size in memory: 1.2 KiB
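
The derived figures above follow from the raw counts. As a quick check, assuming that missing cells are measured against all cells (variables times observations) and that 1 MiB equals 1024 KiB, they can be recomputed as follows; the variable names are illustrative only.

```python
# Recompute the derived dataset statistics from the raw counts in the report.
n_variables = 22
n_observations = 44322
missing_cells = 82372
total_size_mib = 51.6

# Share of missing cells among all cells (assumed denominator: rows x columns).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average record size, converting MiB to KiB (1 MiB = 1024 KiB).
avg_record_kib = total_size_mib * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_kib:.1f} KiB")
```

The 82372 missing cells equal 41116 + 41256, the missing counts reported for Estimate footnote and Standard error footnote, so those two variables account for every missing cell.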

Variable types

Categorical: 12
Text: 5
Unsupported: 1
Numeric: 4

Alerts

Constant: Year has constant value "2025"
High correlation: Occupation code is highly overall correlated with Occupation and 1 other field
High correlation: Subcell code is highly overall correlated with Characteristic and 2 other fields
High correlation: Characteristic is highly overall correlated with Characteristic category and 1 other field
High correlation: Characteristic category is highly overall correlated with Characteristic and 1 other field
High correlation: Datatype is highly overall correlated with Datatype code and 1 other field
High correlation: Datatype code is highly overall correlated with Datatype
High correlation: Estimate category is highly overall correlated with Estimate code
High correlation: Estimate code is highly overall correlated with Estimate category and 2 other fields
High correlation: Estimate footnote is highly overall correlated with Datatype and 2 other fields
High correlation: Industry is highly overall correlated with Industry code
High correlation: Industry code is highly overall correlated with Industry
High correlation: Occupation is highly overall correlated with Occupation code
High correlation: Ownership is highly overall correlated with Ownership code
High correlation: Ownership code is highly overall correlated with Ownership
High correlation: Standard error footnote is highly overall correlated with Estimate code and 1 other field
Imbalance: Industry is highly imbalanced (63.3%)
Imbalance: Occupation is highly imbalanced (60.4%)
Imbalance: Industry code is highly imbalanced (63.3%)
Missing: Estimate footnote has 41116 (92.8%) missing values
Missing: Standard error footnote has 41256 (93.1%) missing values
Unique: Series ID has unique values
Unsupported: Estimate is an unsupported type, check if it needs cleaning or further analysis
Zeros: Estimate code has 3722 (8.4%) zeros
Zeros: Occupation code has 34508 (77.9%) zeros
Zeros: Subcell code has 21150 (47.7%) zeros
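
The imbalance alerts report a score that is 0% when all categories are equally frequent and approaches 100% as a single category dominates. A minimal sketch of one common way to compute such a score follows, assuming it is defined as one minus the Shannon entropy of the category frequencies divided by the maximum possible entropy. That definition and the function name `imbalance_score` are assumptions, not taken from this report. The example uses the five Ownership counts listed later in the report.

```python
import math

def imbalance_score(counts):
    """Return 1 - H/H_max for a list of category counts (assumed definition)."""
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # a single category has no entropy to normalize
    total = sum(counts)
    # Shannon entropy in bits, skipping empty categories.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(n_classes)

# Ownership counts from the report (five categories).
ownership = [17054, 14621, 12131, 272, 244]
print(f"Ownership imbalance: {imbalance_score(ownership):.1%}")

# A perfectly balanced variable scores 0.
print(f"Balanced example: {imbalance_score([100, 100, 100, 100]):.1%}")
```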

Reproduction

Analysis started: 2025-11-06 20:04:08.620020
Analysis finished: 2025-11-06 20:04:12.314188
Duration: 3.69 seconds
Software version: ydata-profiling v4.17.0
Download configuration: config.json
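
The reported duration is the difference between the two timestamps. A short check with Python's standard `datetime` module, assuming both timestamps are in the same time zone:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
fmt = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2025-11-06 20:04:08.620020", fmt)
finished = datetime.strptime("2025-11-06 20:04:12.314188", fmt)

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"Duration: {duration:.2f} seconds")
```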

Variables

Estimate category
Categorical

High correlation 

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.2 MiB

Insurance benefits: 12555
Healthcare benefits: 9920
Leave benefits: 8681
Retirement benefits: 5445
Benefit combinations: 3722
Other values (2): 3999

Length

Max length: 24
Median length: 20
Mean length: 17.869252
Min length: 14
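
The maximum, minimum, and mean length can be recovered from the category counts, since each value's length is weighted by how often it occurs. A small sketch using the seven category counts reported for this variable (the dictionary name `counts` is illustrative):

```python
# Category counts for "Estimate category", as listed under Common Values.
counts = {
    "Insurance benefits": 12555,
    "Healthcare benefits": 9920,
    "Leave benefits": 8681,
    "Retirement benefits": 5445,
    "Benefit combinations": 3722,
    "Financial benefits": 2979,
    "Quality of life benefits": 1020,
}

n_rows = sum(counts.values())
# Total characters: each value's length times its number of occurrences.
total_chars = sum(len(value) * n for value, n in counts.items())

max_length = max(len(value) for value in counts)
min_length = min(len(value) for value in counts)
mean_length = total_chars / n_rows

print(max_length, min_length, total_chars, round(mean_length, 6))
```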

Characters and Unicode

Total characters: 792001
Distinct characters: 25
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
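
As an illustration of such character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below tallies categories for one value of this variable. It is not how the report was produced, and script and block lookups are not part of `unicodedata`.

```python
import unicodedata
from collections import Counter

text = "Insurance benefits"

# Map every character to its Unicode general category, e.g. "Lu" (uppercase
# letter), "Ll" (lowercase letter), "Zs" (space separator).
categories = Counter(unicodedata.category(ch) for ch in text)

for category, n in categories.most_common():
    print(category, n)
```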

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Benefit combinations
2nd row: Benefit combinations
3rd row: Benefit combinations
4th row: Benefit combinations
5th row: Benefit combinations

Common Values

Value | Count | Frequency (%)
Insurance benefits | 12555 | 28.3%
Healthcare benefits | 9920 | 22.4%
Leave benefits | 8681 | 19.6%
Retirement benefits | 5445 | 12.3%
Benefit combinations | 3722 | 8.4%
Financial benefits | 2979 | 6.7%
Quality of life benefits | 1020 | 2.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
benefits | 40600 | 44.8%
insurance | 12555 | 13.8%
healthcare | 9920 | 10.9%
leave | 8681 | 9.6%
retirement | 5445 | 6.0%
benefit | 3722 | 4.1%
combinations | 3722 | 4.1%
financial | 2979 | 3.3%
quality | 1020 | 1.1%
of | 1020 | 1.1%
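
The word frequencies above are consistent with splitting each value on whitespace, lowercasing the tokens, and weighting each token by the value's count. A sketch under that assumption (the tokenization rule is inferred from the table, not confirmed by the report):

```python
from collections import Counter

# Category counts for "Estimate category", as listed under Common Values.
counts = {
    "Insurance benefits": 12555,
    "Healthcare benefits": 9920,
    "Leave benefits": 8681,
    "Retirement benefits": 5445,
    "Benefit combinations": 3722,
    "Financial benefits": 2979,
    "Quality of life benefits": 1020,
}

# Weight every lowercase token by the number of rows holding its value.
word_counts = Counter()
for value, n in counts.items():
    for token in value.lower().split():
        word_counts[token] += n

total_words = sum(word_counts.values())
for word, n in word_counts.most_common(3):
    print(f"{word}: {n} ({n / total_words:.1%})")
```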

Most occurring characters

Value | Count | Frequency (%)
e | 155756 | 19.7%
n | 88279 | 11.1%
t | 69874 | 8.8%
i | 65209 | 8.2%
s | 56877 | 7.2%
a | 51776 | 6.5%
(space) | 46362 | 5.9%
f | 46362 | 5.9%
b | 44322 | 5.6%
c | 29176 | 3.7%
Other values (15) | 138008 | 17.4%

Most occurring categories, scripts, and blocks

Value | Count | Frequency (%)
(unknown) | 792001 | 100.0%

Datatype
Categorical

High correlation 

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.0 MiB

Access rate: 16666
Participation rate: 13771
Average: 3819
50th percentile - median: 2332
Take-up rate: 1902
Other values (5): 5832

Length

Max length: 24
Median length: 18
Mean length: 14.114729
Min length: 7

Characters and Unicode

Total characters: 625593
Distinct characters: 31
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Access rate
2nd row: Access rate
3rd row: Access rate
4th row: Access rate
5th row: Access rate

Common Values

Value | Count | Frequency (%)
Access rate | 16666 | 37.6%
Participation rate | 13771 | 31.1%
Average | 3819 | 8.6%
50th percentile - median | 2332 | 5.3%
Take-up rate | 1902 | 4.3%
90th percentile | 1328 | 3.0%
75th percentile | 1324 | 3.0%
10th percentile | 1254 | 2.8%
25th percentile | 1234 | 2.8%
Share of premiums | 692 | 1.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
rate | 32339 | 35.9%
access | 16666 | 18.5%
participation | 13771 | 15.3%
percentile | 7472 | 8.3%
average | 3819 | 4.2%
50th | 2332 | 2.6%
(blank) | 2332 | 2.6%
median | 2332 | 2.6%
take-up | 1902 | 2.1%
90th | 1328 | 1.5%
Other values (6) | 5888 | 6.5%

Most occurring characters

Value | Count | Frequency (%)
e | 84677 | 13.5%
t | 74825 | 12.0%
a | 68626 | 11.0%
r | 58785 | 9.4%
c | 54575 | 8.7%
i | 51809 | 8.3%
(space) | 45859 | 7.3%
s | 34024 | 5.4%
p | 23837 | 3.8%
n | 23575 | 3.8%
Other values (21) | 105001 | 16.8%

Most occurring categories, scripts, and blocks

Value | Count | Frequency (%)
(unknown) | 625593 | 100.0%

Distinct: 342
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 4.9 MiB

Length

Max length: 115
Median length: 80
Mean length: 59.7323
Min length: 19

Characters and Unicode

Total characters: 2647455
Distinct characters: 52
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Access to both medical care and retirement benefits
2nd row: Access to medical care and no retirement benefits
3rd row: Access to retirement and no medical care benefits
4th row: Access to no medical care and no retirement benefits
5th row: Access to both medical care benefits and life insurance plans

Value | Count | Frequency (%)
of | 17482 | 4.6%
to | 13885 | 3.7%
access | 11121 | 2.9%
care | 10363 | 2.7%
medical | 8982 | 2.4%
contribution | 8970 | 2.4%
paid | 8854 | 2.3%
benefits | 8134 | 2.1%
fixed | 8100 | 2.1%
amount | 7951 | 2.1%
Other values (202) | 274973 | 72.6%

Most occurring characters

Value | Count | Frequency (%)
(space) | 334493 | 12.6%
e | 292303 | 11.0%
a | 186235 | 7.0%
i | 184446 | 7.0%
n | 176494 | 6.7%
t | 162517 | 6.1%
o | 147172 | 5.6%
r | 144195 | 5.4%
s | 108661 | 4.1%
l | 106928 | 4.0%
Other values (42) | 804011 | 30.4%

Most occurring categories, scripts, and blocks

Value | Count | Frequency (%)
(unknown) | 2647455 | 100.0%

Ownership
Categorical

High correlation 

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.4 MiB

Private industry workers: 17054
Civilian workers: 14621
State and local government workers: 12131
Local government workers: 272
State government workers: 244

Length

Max length: 34
Median length: 24
Mean length: 24.097965
Min length: 16

Characters and Unicode

Total characters: 1068070
Distinct characters: 23
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Civilian workers
2nd row: Civilian workers
3rd row: Civilian workers
4th row: Civilian workers
5th row: Civilian workers

Common Values

Value | Count | Frequency (%)
Private industry workers | 17054 | 38.5%
Civilian workers | 14621 | 33.0%
State and local government workers | 12131 | 27.4%
Local government workers | 272 | 0.6%
State government workers | 244 | 0.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
workers | 44322 | 31.1%
private | 17054 | 12.0%
industry | 17054 | 12.0%
civilian | 14621 | 10.3%
government | 12647 | 8.9%
local | 12403 | 8.7%
state | 12375 | 8.7%
and | 12131 | 8.5%

Most occurring characters

Value | Count | Frequency (%)
r | 135399 | 12.7%
e | 99045 | 9.3%
(space) | 98285 | 9.2%
i | 77971 | 7.3%
t | 71505 | 6.7%
o | 69372 | 6.5%
n | 69100 | 6.5%
a | 68584 | 6.4%
s | 61376 | 5.7%
v | 44322 | 4.1%
Other values (13) | 273111 | 25.6%

Most occurring categories, scripts, and blocks

Value | Count | Frequency (%)
(unknown) | 1068070 | 100.0%

Industry
Categorical

High correlation  Imbalance 

Distinct: 28
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.1 MiB

All industries: 34496
Service-providing: 813
Education and health services: 770
Educational services: 755
Junior colleges, colleges, universities, and professional schools: 709
Other values (23): 6779

Length

Max length: 72
Median length: 14
Mean length: 16.879834
Min length: 9

Characters and Unicode

Total characters: 748148
Distinct characters: 43
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: All industries
2nd row: All industries
3rd row: All industries
4th row: All industries
5th row: All industries

Common Values

Value | Count | Frequency (%)
All industries | 34496 | 77.8%
Service-providing | 813 | 1.8%
Education and health services | 770 | 1.7%
Educational services | 755 | 1.7%
Junior colleges, colleges, universities, and professional schools | 709 | 1.6%
Health care and social assistance | 682 | 1.5%
Hospitals | 620 | 1.4%
Elementary and secondary schools | 526 | 1.2%
Goods-producing | 525 | 1.2%
Public administration | 507 | 1.1%
Other values (18) | 3919 | 8.8%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
all | 34496 | 34.7%
industries | 34496 | 34.7%
and | 5077 | 5.1%
services | 2523 | 2.5%
health | 1452 | 1.5%
colleges | 1418 | 1.4%
schools | 1235 | 1.2%
professional | 1171 | 1.2%
service-providing | 813 | 0.8%
education | 770 | 0.8%
Other values (46) | 15826 | 15.9%

Most occurring characters

Value | Count | Frequency (%)
i | 92986 | 12.4%
s | 91873 | 12.3%
l | 81053 | 10.8%
e | 58258 | 7.8%
(space) | 54955 | 7.3%
n | 54955 | 7.3%
r | 49412 | 6.6%
t | 48833 | 6.5%
d | 46093 | 6.2%
u | 40913 | 5.5%
Other values (33) | 128817 | 17.2%

Most occurring categories, scripts, and blocks

Value | Count | Frequency (%)
(unknown) | 748148 | 100.0%

Occupation
Categorical

High correlation  Imbalance 

Distinct: 18
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 MiB

All occupations: 34508
Management, professional and related occupations: 804
Professional and related occupations: 799
Sales and office occupations: 790
Office and administrative support occupations: 780
Other values (13): 6641

Length

Max length: 72
Median length: 15
Mean length: 20.821082
Min length: 8

Characters and Unicode

Total characters: 922832
Distinct characters: 33
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: All occupations
2nd row: All occupations
3rd row: All occupations
4th row: All occupations
5th row: All occupations

Common Values

Value | Count | Frequency (%)
All occupations | 34508 | 77.9%
Management, professional and related occupations | 804 | 1.8%
Professional and related occupations | 799 | 1.8%
Sales and office occupations | 790 | 1.8%
Office and administrative support occupations | 780 | 1.8%
Service occupations | 758 | 1.7%
Natural resources, construction, and maintenance occupations | 743 | 1.7%
Production, transportation, and material moving occupations | 737 | 1.7%
Transportation and material moving occupations | 518 | 1.2%
Management, business, and financial occupations | 514 | 1.2%
Other values (8) | 3371 | 7.6%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
occupations | 43133 | 38.1%
all | 34508 | 30.5%
and | 8024 | 7.1%
related | 2096 | 1.9%
professional | 1603 | 1.4%
office | 1570 | 1.4%
management | 1318 | 1.2%
sales | 1283 | 1.1%
transportation | 1255 | 1.1%
material | 1255 | 1.1%
Other values (27) | 17204 | 15.2%

Most occurring characters

Value | Count | Frequency (%)
o | 103890 | 11.3%
c | 98014 | 10.6%
l | 79368 | 8.6%
a | 75558 | 8.2%
n | 72761 | 7.9%
(space) | 68927 | 7.5%
t | 63408 | 6.9%
i | 63244 | 6.9%
s | 59301 | 6.4%
u | 49030 | 5.3%
Other values (23) | 189331 | 20.5%

Most occurring categories, scripts, and blocks

Value | Count | Frequency (%)
(unknown) | 922832 | 100.0%

Characteristic category
Categorical

High correlation 

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.0 MiB

All workers: 21150
Census area: 9258
Establishment size: 5577
Average wage category: 4582
Work status: 1883

Length

Max length: 21
Median length: 11
Mean length: 13.16802
Min length: 11

Characters and Unicode

Total characters: 583633
Distinct characters: 26
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: All workers
2nd row: All workers
3rd row: All workers
4th row: All workers
5th row: All workers

Common Values

Value | Count | Frequency (%)
All workers | 21150 | 47.7%
Census area | 9258 | 20.9%
Establishment size | 5577 | 12.6%
Average wage category | 4582 | 10.3%
Work status | 1883 | 4.2%
Bargaining status | 1872 | 4.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
all | 21150 | 22.7%
workers | 21150 | 22.7%
census | 9258 | 9.9%
area | 9258 | 9.9%
establishment | 5577 | 6.0%
size | 5577 | 6.0%
average | 4582 | 4.9%
wage | 4582 | 4.9%
category | 4582 | 4.9%
status | 3755 | 4.0%
Other values (2) | 3755 | 4.0%

Most occurring characters

Value | Count | Frequency (%)
e | 69148 | 11.8%
r | 64477 | 11.0%
s | 63907 | 10.9%
(space) | 48904 | 8.4%
l | 47877 | 8.2%
a | 45338 | 7.8%
o | 27615 | 4.7%
A | 25732 | 4.4%
w | 25732 | 4.4%
t | 23246 | 4.0%
Other values (16) | 141657 | 24.3%

Most occurring categories, scripts, and blocks

Value | Count | Frequency (%)
(unknown) | 583633 | 100.0%

Characteristic
Categorical

High correlation 

Distinct: 30
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.9 MiB

All workers: 21150
Full-time: 994
100 workers or more: 984
Nonunion: 978
500 workers or more: 965
Other values (25): 19251

Length

Max length: 21
Median length: 20
Mean length: 12.334913
Min length: 4

Characters and Unicode

Total characters: 546708
Distinct characters: 40
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: All workers
2nd row: All workers
3rd row: All workers
4th row: All workers
5th row: All workers

Common Values

Value | Count | Frequency (%)
All workers | 21150 | 47.7%
Full-time | 994 | 2.2%
100 workers or more | 984 | 2.2%
Nonunion | 978 | 2.2%
500 workers or more | 965 | 2.2%
100-499 workers | 937 | 2.1%
Less than 100 workers | 933 | 2.1%
Union | 894 | 2.0%
Part-time | 889 | 2.0%
Less than 50 workers | 885 | 2.0%
Other values (20) | 14713 | 33.2%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
workers | 26727 | 28.1%
all | 21150 | 22.2%
percent | 4582 | 4.8%
25 | 3150 | 3.3%
south | 2799 | 2.9%
central | 2711 | 2.8%
west | 2102 | 2.2%
more | 1949 | 2.0%
or | 1949 | 2.0%
100 | 1917 | 2.0%
Other values (25) | 26212 | 27.5%

Most occurring characters

Value | Count | Frequency (%)
r | 68501 | 12.5%
e | 53002 | 9.7%
(space) | 50926 | 9.3%
l | 49817 | 9.1%
o | 41333 | 7.6%
s | 38366 | 7.0%
w | 29522 | 5.4%
t | 28484 | 5.2%
k | 26727 | 4.9%
A | 22627 | 4.1%
Other values (30) | 137403 | 25.1%

Most occurring categories, scripts, and blocks

Value | Count | Frequency (%)
(unknown) | 546708 | 100.0%

Year
Categorical

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.6 MiB

2025: 44322

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 177288
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2025
2nd row: 2025
3rd row: 2025
4th row: 2025
5th row: 2025

Common Values

Value | Count | Frequency (%)
2025 | 44322 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
2025 | 44322 | 100.0%

Most occurring characters

Value | Count | Frequency (%)
2 | 88644 | 50.0%
0 | 44322 | 25.0%
5 | 44322 | 25.0%

Most occurring categories, scripts, and blocks

Value | Count | Frequency (%)
(unknown) | 177288 | 100.0%

Estimate
Unsupported

Rejected  Unsupported 

Missing: 0
Missing (%): 0.0%
Memory size: 2.5 MiB
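
The Unsupported alert may indicate that the column mixes types or holds values the profiler could not classify. One hedged way to clean such a column before re-profiling is to coerce each value to a number and treat anything unparseable as missing. The helper `to_float` and the sample values below are hypothetical; the actual contents of Estimate are not shown in this report.

```python
def to_float(value):
    """Return value as a float, or None when it cannot be parsed."""
    if value is None:
        return None
    try:
        # Drop thousands separators and surrounding whitespace before parsing.
        return float(str(value).replace(",", "").strip())
    except ValueError:
        return None

# Hypothetical raw values; the real Estimate column is not reproduced here.
raw = ["71", "1,250", 38.5, "-", None]
cleaned = [to_float(v) for v in raw]
print(cleaned)
```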

Estimate footnote
Categorical

High correlation  Missing 

Distinct: 4
Distinct (%): 0.1%
Missing: 41116
Missing (%): 92.8%
Memory size: 2.7 MiB

7.0: 1401
6.0: 723
5.0: 700
1.0: 382

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 9618
Distinct characters: 6
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1.0
2nd row: 1.0
3rd row: 1.0
4th row: 1.0
5th row: 1.0

Common Values

Value | Count | Frequency (%)
7.0 | 1401 | 3.2%
6.0 | 723 | 1.6%
5.0 | 700 | 1.6%
1.0 | 382 | 0.9%
(Missing) | 41116 | 92.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
7.0 | 1401 | 43.7%
6.0 | 723 | 22.6%
5.0 | 700 | 21.8%
1.0 | 382 | 11.9%
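
The frequencies in this second table differ from those under Common Values because they appear to be computed over the 3206 non-missing rows rather than all 44322 rows. The check below reproduces both sets of percentages from the reported counts; the reading of the denominators is an inference from the numbers, not something the report states.

```python
n_rows = 44322
n_missing = 41116
counts = {"7.0": 1401, "6.0": 723, "5.0": 700, "1.0": 382}

n_present = n_rows - n_missing  # rows that carry a footnote value

for value, n in counts.items():
    share_of_all = 100 * n / n_rows          # denominator: every row
    share_of_present = 100 * n / n_present   # denominator: non-missing rows
    print(f"{value}: {share_of_all:.1f}% of all rows, "
          f"{share_of_present:.1f}% of non-missing rows")
```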

Most occurring characters

Value | Count | Frequency (%)
. | 3206 | 33.3%
0 | 3206 | 33.3%
7 | 1401 | 14.6%
6 | 723 | 7.5%
5 | 700 | 7.3%
1 | 382 | 4.0%

Most occurring categories, scripts, and blocks

Value | Count | Frequency (%)
(unknown) | 9618 | 100.0%

Distinct: 3095
Distinct (%): 7.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 MiB

Length

Max length: 9
Median length: 3
Mean length: 3.2618113
Min length: 1

Characters and Unicode

Total characters: 144570
Distinct characters: 12
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2048
Unique (%): 4.6%

Sample

1st row: 0.8
2nd row: 0.6
3rd row: 0.6
4th row: 0.6
5th row: 0.9

Value | Count | Frequency (%)
0.0 | 1997 | 4.5%
1.1 | 1737 | 3.9%
0.9 | 1713 | 3.9%
1.0 | 1654 | 3.7%
1.3 | 1593 | 3.6%
1.2 | 1553 | 3.5%
0.7 | 1547 | 3.5%
0.8 | 1540 | 3.5%
0.4 | 1457 | 3.3%
1.4 | 1454 | 3.3%
Other values (3085) | 28077 | 63.3%

Most occurring characters

Value | Count | Frequency (%)
. | 44080 | 30.5%
0 | 29329 | 20.3%
1 | 20254 | 14.0%
2 | 12000 | 8.3%
3 | 7936 | 5.5%
4 | 6311 | 4.4%
5 | 5512 | 3.8%
6 | 5100 | 3.5%
7 | 4826 | 3.3%
9 | 4491 | 3.1%
Other values (2) | 4731 | 3.3%

Most occurring categories, scripts, and blocks

Value | Count | Frequency (%)
(unknown) | 144570 | 100.0%

Standard error footnote
Categorical

High correlation  Missing 

Distinct: 4
Distinct (%): 0.1%
Missing: 41256
Missing (%): 93.1%
Memory size: 2.7 MiB

6.0: 1401
5.0: 723
8.0: 700
2.0: 242

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 9198
Distinct characters: 6
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2.0
2nd row: 2.0
3rd row: 2.0
4th row: 2.0
5th row: 2.0

Common Values

Value | Count | Frequency (%)
6.0 | 1401 | 3.2%
5.0 | 723 | 1.6%
8.0 | 700 | 1.6%
2.0 | 242 | 0.5%
(Missing) | 41256 | 93.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
6.0 | 1401 | 45.7%
5.0 | 723 | 23.6%
8.0 | 700 | 22.8%
2.0 | 242 | 7.9%

Most occurring characters

Value | Count | Frequency (%)
. | 3066 | 33.3%
0 | 3066 | 33.3%
6 | 1401 | 15.2%
5 | 723 | 7.9%
8 | 700 | 7.6%
2 | 242 | 2.6%

Most occurring categories, scripts, and blocks

Value | Count | Frequency (%)
(unknown) | 9198 | 100.0%

Distinct: 44145
Distinct (%): 99.6%
Missing: 0
Missing (%): 0.0%
Memory size: 8.4 MiB

Length

Max length: 250
Median length: 194
Mean length: 142.47606
Min length: 51

Characters and Unicode

Total characters: 6314824
Distinct characters: 54
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 43968
Unique (%): 99.2%

Sample

1st row: Percent of civilian workers with access to both medical care and retirement benefits
2nd row: Percent of civilian workers with access to medical care and no retirement benefits
3rd row: Percent of civilian workers with access to retirement and no medical care benefits
4th row: Percent of civilian workers with access to no medical care and no retirement benefits
5th row: Percent of civilian workers with access to both medical care benefits and life insurance plans

Value | Count | Frequency (%)
workers | 50682 | 5.5%
of | 49717 | 5.4%
with | 42900 | 4.6%
percent | 40475 | 4.4%
in | 37767 | 4.1%
and | 29987 | 3.2%
for | 24023 | 2.6%
to | 19604 | 2.1%
plans | 19427 | 2.1%
private | 17054 | 1.8%
Other values (316) | 592241 | 64.1%

Most occurring characters

Value | Count | Frequency (%)
(space) | 879555 | 13.9%
e | 628394 | 10.0%
i | 463569 | 7.3%
n | 453469 | 7.2%
t | 447759 | 7.1%
a | 439215 | 7.0%
r | 426959 | 6.8%
o | 352000 | 5.6%
s | 318965 | 5.1%
c | 250751 | 4.0%
Other values (44) | 1654188 | 26.2%

Most occurring categories, scripts, and blocks

Value | Count | Frequency (%)
(unknown) | 6314824 | 100.0%

Series ID
Text

Unique 

Distinct: 44322
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 3.4 MiB

Length

Max length: 23
Median length: 23
Mean length: 23
Min length: 23

Characters and Unicode

Total characters: 1019406
Distinct characters: 17
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 44322
Unique (%): 100.0%

Sample

1st row: NBU10000000000000028007
2nd row: NBU10000000000000028008
3rd row: NBU10000000000000028009
4th row: NBU10000000000000028010
5th row: NBU10000000000000028011

Value | Count | Frequency (%)
nbu10000000000000028007 | 1 | < 0.1%
nbu10000000000000028014 | 1 | < 0.1%
nbu10000000000000028028 | 1 | < 0.1%
nbu10000000000000028027 | 1 | < 0.1%
nbu10000000000000028026 | 1 | < 0.1%
nbu10000000000000028009 | 1 | < 0.1%
nbu10000000000000028010 | 1 | < 0.1%
nbu10000000000000028011 | 1 | < 0.1%
nbu10000000000000028012 | 1 | < 0.1%
nbu10000000000000028013 | 1 | < 0.1%
Other values (44312) | 44312 | > 99.9%

Most occurring characters

Value | Count | Frequency (%)
0 | 491317 | 48.2%
2 | 91478 | 9.0%
1 | 84036 | 8.2%
3 | 59248 | 5.8%
N | 44322 | 4.3%
B | 44322 | 4.3%
U | 44322 | 4.3%
5 | 35358 | 3.5%
4 | 28112 | 2.8%
6 | 26798 | 2.6%
Other values (7) | 70093 | 6.9%


Ownership code
Categorical

High correlation 

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 44322
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
2 | 17054 | 38.5%
1 | 14621 | 33.0%
3 | 12131 | 27.4%
5 | 272 | 0.6%
4 | 244 | 0.6%

Length

[Figure: Histogram of lengths of the category]


Estimate code
Real number (ℝ)

High correlation  Zeros 

Distinct: 38
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20.944903
Minimum: 0
Maximum: 98
Zeros: 3722
Zeros (%): 8.4%
Negative: 0
Negative (%): 0.0%
Memory size: 346.4 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 7
Median: 15
Q3: 17
95th percentile: 89
Maximum: 98
Range: 98
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 23.66215
Coefficient of variation (CV): 1.1297331
Kurtosis: 3.1897975
Mean: 20.944903
Median Absolute Deviation (MAD): 3
Skewness: 2.0975043
Sum: 928320
Variance: 559.89733
Monotonicity: Not monotonic
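
Several of the figures above follow from one another by simple arithmetic. The sketch below recomputes them from the reported values for Estimate code as a sanity check. The number of observations (44322) is taken from the dataset overview, and the dictionary name `derived` is only an illustrative choice.

```python
# Figures copied from the Estimate code summary; n is the number of observations.
n = 44322
total = 928320
mean = 20.944903
std = 23.66215
q1, q3 = 7, 17
minimum, maximum = 0, 98

derived = {
    "mean": total / n,           # Sum divided by the number of observations
    "variance": std ** 2,        # square of the standard deviation
    "cv": std / mean,            # coefficient of variation
    "iqr": q3 - q1,              # interquartile range
    "range": maximum - minimum,  # Maximum minus Minimum
}
for name, value in derived.items():
    print(f"{name}: {value:.5f}")
```

The recomputed values agree with the reported Mean, Variance, CV, IQR and Range up to rounding.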
[Figure: Histogram with fixed size bins (bins=38)]

Common Values

Value | Count | Frequency (%)
15 | 8197 | 18.5%
16 | 4717 | 10.6%
14 | 4521 | 10.2%
0 | 3722 | 8.4%
17 | 3144 | 7.1%
5 | 2879 | 6.5%
12 | 2635 | 5.9%
7 | 2600 | 5.9%
19 | 2547 | 5.7%
6 | 2338 | 5.3%
Other values (28) | 7022 | 15.8%

Minimum 10 values

Value | Count | Frequency (%)
0 | 3722 | 8.4%
5 | 2879 | 6.5%
6 | 2338 | 5.3%
7 | 2600 | 5.9%
12 | 2635 | 5.9%
14 | 4521 | 10.2%
15 | 8197 | 18.5%
16 | 4717 | 10.6%
17 | 3144 | 7.1%
19 | 2547 | 5.7%

Maximum 10 values

Value | Count | Frequency (%)
98 | 172 | 0.4%
97 | 171 | 0.4%
94 | 346 | 0.8%
90 | 1386 | 3.1%
89 | 173 | 0.4%
88 | 173 | 0.4%
86 | 173 | 0.4%
85 | 172 | 0.4%
84 | 173 | 0.4%
83 | 173 | 0.4%

Industry code
Categorical

High correlation  Imbalance 

Distinct: 28
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 MiB

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Characters and Unicode

Total characters: 265932
Distinct characters: 13
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 000000
2nd row: 000000
3rd row: 000000
4th row: 000000
5th row: 000000

Common Values

Value | Count | Frequency (%)
000000 | 34496 | 77.8%
S00000 | 813 | 1.8%
600000 | 770 | 1.7%
610000 | 755 | 1.7%
612000 | 709 | 1.6%
620000 | 682 | 1.5%
622000 | 620 | 1.4%
611100 | 526 | 1.2%
G00000 | 525 | 1.2%
920000 | 507 | 1.1%
Other values (18) | 3919 | 8.8%

Length

[Figure: Histogram of lengths of the category]

Most occurring characters

Value | Count | Frequency (%)
0 | 245533 | 92.3%
2 | 5177 | 1.9%
6 | 4243 | 1.6%
1 | 3683 | 1.4%
5 | 1974 | 0.7%
4 | 1625 | 0.6%
3 | 822 | 0.3%
S | 813 | 0.3%
G | 525 | 0.2%
9 | 507 | 0.2%
Other values (3) | 1030 | 0.4%


Occupation code
Real number (ℝ)

High correlation  Zeros 

Distinct: 18
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 77979.51
Minimum: 0
Maximum: 530000
Zeros: 34508
Zeros (%): 77.9%
Negative: 0
Negative (%): 0.0%
Memory size: 346.4 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 0
Q3: 0
95th percentile: 490000
Maximum: 530000
Range: 530000
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 160886.43
Coefficient of variation (CV): 2.0631885
Kurtosis: 1.6665603
Mean: 77979.51
Median Absolute Deviation (MAD): 0
Skewness: 1.8195786
Sum: 3.4562079 × 10^9
Variance: 2.5884443 × 10^10
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=18)]

Common Values

Value | Count | Frequency (%)
0 | 34508 | 77.9%
112900 | 804 | 1.8%
152900 | 799 | 1.8%
414300 | 790 | 1.8%
430000 | 780 | 1.8%
313900 | 758 | 1.7%
454900 | 743 | 1.7%
515300 | 737 | 1.7%
530000 | 518 | 1.2%
111300 | 514 | 1.2%
Other values (8) | 3371 | 7.6%

Minimum 10 values

Value | Count | Frequency (%)
0 | 34508 | 77.9%
111300 | 514 | 1.2%
112900 | 804 | 1.8%
152900 | 799 | 1.8%
250001 | 501 | 1.1%
252000 | 465 | 1.0%
291111 | 223 | 0.5%
313900 | 758 | 1.7%
330000 | 241 | 0.5%
410000 | 493 | 1.1%

Maximum 10 values

Value | Count | Frequency (%)
530000 | 518 | 1.2%
515300 | 737 | 1.7%
510000 | 508 | 1.1%
490000 | 499 | 1.1%
454900 | 743 | 1.7%
454700 | 441 | 1.0%
430000 | 780 | 1.8%
414300 | 790 | 1.8%
410000 | 493 | 1.1%
330000 | 241 | 0.5%

Subcell code
Real number (ℝ)

High correlation  Zeros 

Distinct: 42
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.429042
Minimum: 0
Maximum: 59
Zeros: 21150
Zeros (%): 47.7%
Negative: 0
Negative (%): 0.0%
Memory size: 346.4 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 2
Q3: 17
95th percentile: 48
Maximum: 59
Range: 59
Interquartile range (IQR): 17

Descriptive statistics

Standard deviation: 15.070878
Coefficient of variation (CV): 1.4450875
Kurtosis: 1.9337808
Mean: 10.429042
Median Absolute Deviation (MAD): 2
Skewness: 1.6384727
Sum: 462236
Variance: 227.13137
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=42)]

Common Values

Value | Count | Frequency (%)
0 | 21150 | 47.7%
25 | 994 | 2.2%
5 | 984 | 2.2%
24 | 978 | 2.2%
7 | 965 | 2.2%
6 | 937 | 2.1%
1 | 933 | 2.1%
23 | 894 | 2.0%
26 | 889 | 2.0%
2 | 885 | 2.0%
Other values (32) | 14713 | 33.2%

Minimum 10 values

Value | Count | Frequency (%)
0 | 21150 | 47.7%
1 | 933 | 2.1%
2 | 885 | 2.0%
4 | 873 | 2.0%
5 | 984 | 2.2%
6 | 937 | 2.1%
7 | 965 | 2.2%
8 | 751 | 1.7%
9 | 602 | 1.4%
10 | 739 | 1.7%

Maximum 10 values

Value | Count | Frequency (%)
59 | 262 | 0.6%
58 | 265 | 0.6%
57 | 269 | 0.6%
56 | 243 | 0.5%
55 | 269 | 0.6%
54 | 253 | 0.6%
51 | 255 | 0.6%
49 | 259 | 0.6%
48 | 263 | 0.6%
46 | 240 | 0.5%

Datatype code
Real number (ℝ)

High correlation 

Distinct: 13
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 25.937525
Minimum: 20
Maximum: 33
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 346.4 KiB

Quantile statistics

Minimum: 20
5th percentile: 20
Q1: 20
Median: 28
Q3: 30
95th percentile: 33
Maximum: 33
Range: 13
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 4.6329404
Coefficient of variation (CV): 0.17861921
Kurtosis: -1.4106781
Mean: 25.937525
Median Absolute Deviation (MAD): 4
Skewness: -0.034110672
Sum: 1149603
Variance: 21.464137
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=13)]

Common Values

Value | Count | Frequency (%)
20 | 11869 | 26.8%
28 | 8771 | 19.8%
33 | 4812 | 10.9%
30 | 3819 | 8.6%
29 | 3083 | 7.0%
23 | 2332 | 5.3%
26 | 1902 | 4.3%
32 | 1902 | 4.3%
25 | 1328 | 3.0%
24 | 1324 | 3.0%
Other values (3) | 3180 | 7.2%

Minimum 10 values

Value | Count | Frequency (%)
20 | 11869 | 26.8%
21 | 1254 | 2.8%
22 | 1234 | 2.8%
23 | 2332 | 5.3%
24 | 1324 | 3.0%
25 | 1328 | 3.0%
26 | 1902 | 4.3%
28 | 8771 | 19.8%
29 | 3083 | 7.0%
30 | 3819 | 8.6%

Maximum 10 values

Value | Count | Frequency (%)
33 | 4812 | 10.9%
32 | 1902 | 4.3%
31 | 692 | 1.6%
30 | 3819 | 8.6%
29 | 3083 | 7.0%
28 | 8771 | 19.8%
26 | 1902 | 4.3%
25 | 1328 | 3.0%
24 | 1324 | 3.0%
23 | 2332 | 5.3%

Provision code
Text

Distinct: 342
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 MiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 132966
Distinct characters: 11
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 007
2nd row: 008
3rd row: 009
4th row: 010
5th row: 011

Value | Count | Frequency (%)
312 | 174 | 0.4%
320 | 174 | 0.4%
319 | 174 | 0.4%
290 | 174 | 0.4%
321 | 174 | 0.4%
222 | 173 | 0.4%
291 | 173 | 0.4%
360 | 173 | 0.4%
350 | 173 | 0.4%
348 | 173 | 0.4%
Other values (332) | 42587 | 96.1%

Most occurring characters

Value | Count | Frequency (%)
1 | 20137 | 15.1%
2 | 19823 | 14.9%
3 | 19098 | 14.4%
0 | 16639 | 12.5%
8 | 11057 | 8.3%
7 | 9108 | 6.8%
4 | 9083 | 6.8%
9 | 9071 | 6.8%
6 | 9071 | 6.8%
5 | 8449 | 6.4%


Interactions

[Figures: 16 pairwise interaction plots; images not preserved]

Correlations

Variable | Year | Estimate footnote | Standard error footnote | Ownership code | Estimate code | Occupation code | Subcell code | Datatype code
Year | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
Estimate footnote | NaN | 1.000 | -0.161 | 0.058 | 0.295 | 0.058 | -0.044 | -0.262
Standard error footnote | NaN | -0.161 | 1.000 | 0.032 | 0.307 | 0.042 | 0.006 | -0.383
Ownership code | NaN | 0.058 | 0.032 | 1.000 | 0.007 | -0.094 | 0.036 | 0.002
Estimate code | NaN | 0.295 | 0.307 | 0.007 | 1.000 | 0.012 | -0.006 | 0.371
Occupation code | NaN | 0.058 | 0.042 | -0.094 | 0.012 | 1.000 | -0.335 | -0.016
Subcell code | NaN | -0.044 | 0.006 | 0.036 | -0.006 | -0.335 | 1.000 | -0.005
Datatype code | NaN | -0.262 | -0.383 | 0.002 | 0.371 | -0.016 | -0.005 | 1.000

Variable | Year | Estimate footnote | Standard error footnote | Ownership code | Estimate code | Occupation code | Subcell code | Datatype code
Year | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
Estimate footnote | NaN | 1.000 | -0.251 | 0.048 | -0.377 | 0.016 | -0.023 | -0.062
Standard error footnote | NaN | -0.251 | 1.000 | 0.032 | -0.145 | 0.020 | -0.007 | -0.118
Ownership code | NaN | 0.048 | 0.032 | 1.000 | 0.008 | -0.087 | 0.025 | 0.006
Estimate code | NaN | -0.377 | -0.145 | 0.008 | 1.000 | 0.029 | -0.026 | 0.000
Occupation code | NaN | 0.016 | 0.020 | -0.087 | 0.029 | 1.000 | -0.506 | -0.022
Subcell code | NaN | -0.023 | -0.007 | 0.025 | -0.026 | -0.506 | 1.000 | 0.008
Datatype code | NaN | -0.062 | -0.118 | 0.006 | 0.000 | -0.022 | 0.008 | 1.000

Variable | Characteristic | Characteristic category | Datatype | Datatype code | Estimate category | Estimate code | Estimate footnote | Industry | Industry code | Occupation | Occupation code | Ownership | Ownership code | Standard error footnote | Subcell code
Characteristic | 1.000 | 1.000 | 0.000 | 0.022 | 0.054 | 0.038 | 0.091 | 0.104 | 0.104 | 0.133 | 0.209 | 0.081 | 0.081 | 0.052 | 0.839
Characteristic category | 1.000 | 1.000 | 0.026 | 0.037 | 0.063 | 0.052 | 0.089 | 0.249 | 0.249 | 0.249 | 0.249 | 0.083 | 0.083 | 0.075 | 0.757
Datatype | 0.000 | 0.026 | 1.000 | 0.881 | 0.330 | 0.268 | 0.577 | 0.000 | 0.000 | 0.000 | 0.000 | 0.008 | 0.008 | 0.498 | 0.000
Datatype code | 0.022 | 0.037 | 0.881 | 1.000 | 0.381 | 0.000 | 0.457 | 0.000 | 0.000 | 0.000 | -0.022 | 0.013 | 0.013 | 0.460 | 0.008
Estimate category | 0.054 | 0.063 | 0.330 | 0.381 | 1.000 | 0.546 | 0.478 | 0.012 | 0.012 | 0.015 | 0.019 | 0.023 | 0.023 | 0.357 | 0.032
Estimate code | 0.038 | 0.052 | 0.268 | 0.000 | 0.546 | 1.000 | 0.556 | 0.000 | 0.000 | 0.000 | 0.029 | 0.000 | 0.000 | 0.529 | -0.026
Estimate footnote | 0.091 | 0.089 | 0.577 | 0.457 | 0.478 | 0.556 | 1.000 | 0.053 | 0.053 | 0.000 | 0.041 | 0.022 | 0.022 | 1.000 | 0.089
Industry | 0.104 | 0.249 | 0.000 | 0.000 | 0.012 | 0.000 | 0.053 | 1.000 | 1.000 | 0.064 | 0.105 | 0.207 | 0.207 | 0.000 | 0.156
Industry code | 0.104 | 0.249 | 0.000 | 0.000 | 0.012 | 0.000 | 0.053 | 1.000 | 1.000 | 0.064 | 0.105 | 0.207 | 0.207 | 0.000 | 0.156
Occupation | 0.133 | 0.249 | 0.000 | 0.000 | 0.015 | 0.000 | 0.000 | 0.064 | 0.064 | 1.000 | 1.000 | 0.134 | 0.134 | 0.000 | 0.156
Occupation code | 0.209 | 0.249 | 0.000 | -0.022 | 0.019 | 0.029 | 0.041 | 0.105 | 0.105 | 1.000 | 1.000 | 0.108 | 0.108 | 0.000 | -0.506
Ownership | 0.081 | 0.083 | 0.008 | 0.013 | 0.023 | 0.000 | 0.022 | 0.207 | 0.207 | 0.134 | 0.108 | 1.000 | 1.000 | 0.000 | 0.169
Ownership code | 0.081 | 0.083 | 0.008 | 0.013 | 0.023 | 0.000 | 0.022 | 0.207 | 0.207 | 0.134 | 0.108 | 1.000 | 1.000 | 0.000 | 0.169
Standard error footnote | 0.052 | 0.075 | 0.498 | 0.460 | 0.357 | 0.529 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.045
Subcell code | 0.839 | 0.757 | 0.000 | 0.008 | 0.032 | -0.026 | 0.089 | 0.156 | 0.156 | 0.156 | -0.506 | 0.169 | 0.169 | 0.045 | 1.000
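
The "High correlation" alerts at the top of the report can be related back to the last matrix above. The sketch below applies a cut-off of 0.5 on the absolute coefficient to two of its rows. The cut-off is an assumption made for illustration (the report does not state one), and `strongly_correlated` is a hypothetical helper, not part of any library.

```python
# Threshold is an assumption for illustration; the report does not state it.
THRESHOLD = 0.5

columns = [
    "Characteristic", "Characteristic category", "Datatype", "Datatype code",
    "Estimate category", "Estimate code", "Estimate footnote", "Industry",
    "Industry code", "Occupation", "Occupation code", "Ownership",
    "Ownership code", "Standard error footnote", "Subcell code",
]

# Two rows copied from the matrix above.
rows = {
    "Estimate footnote": [0.091, 0.089, 0.577, 0.457, 0.478, 0.556, 1.000,
                          0.053, 0.053, 0.000, 0.041, 0.022, 0.022, 1.000, 0.089],
    "Subcell code": [0.839, 0.757, 0.000, 0.008, 0.032, -0.026, 0.089,
                     0.156, 0.156, 0.156, -0.506, 0.169, 0.169, 0.045, 1.000],
}


def strongly_correlated(variable: str) -> list[str]:
    """Return other columns whose |coefficient| with `variable` meets the threshold."""
    return [
        other
        for other, value in zip(columns, rows[variable])
        if other != variable and abs(value) >= THRESHOLD
    ]


for name in rows:
    print(name, "->", strongly_correlated(name))
```

Under that assumed cut-off, each of the two rows yields three partner columns, which matches the "X and 2 other fields" wording of the corresponding alerts.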

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
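
One way to read "nullity correlation" is as the Pearson correlation between the present/missing indicators of two columns. The sketch below works through that reading on a small made-up table. The rows, the column names `a`, `b`, `c`, and the helper functions are all hypothetical and are not taken from the dataset or from the report's implementation.

```python
import math

# Hypothetical toy rows: None marks a missing cell. Not taken from the dataset.
rows = [
    {"a": 1, "b": None, "c": 5},
    {"a": None, "b": None, "c": 6},
    {"a": 3, "b": 2, "c": None},
    {"a": None, "b": None, "c": 8},
    {"a": 5, "b": 4, "c": 9},
    {"a": 6, "b": 7, "c": None},
]


def present(column: str) -> list[int]:
    """1 where the column has a value, 0 where it is missing."""
    return [0 if row[column] is None else 1 for row in rows]


def pearson(x: list[int], y: list[int]) -> float:
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def nullity_correlation(col1: str, col2: str) -> float:
    """Correlation between the presence indicators of two columns."""
    return pearson(present(col1), present(col2))


print(nullity_correlation("a", "b"))  # positive: b tends to be present when a is
print(nullity_correlation("a", "c"))  # negative: c tends to be missing when a is present
```

A column with no missing values has a constant indicator, so this measure is undefined for it and such columns would have to be left out.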

Sample

First rows

Index | Estimate category | Datatype | Provision | Ownership | Industry | Occupation | Characteristic category | Characteristic | Year | Estimate | Estimate footnote | Standard error | Standard error footnote | Series title | Series ID | Ownership code | Estimate code | Industry code | Occupation code | Subcell code | Datatype code | Provision code
0 | Benefit combinations | Access rate | Access to both medical care and retirement benefits | Civilian workers | All industries | All occupations | All workers | All workers | 2025 | 66 | NaN | 0.8 | NaN | Percent of civilian workers with access to both medical care and retirement benefits | NBU10000000000000028007 | 1 | 0 | 000000 | 0 | 0 | 28 | 007
1 | Benefit combinations | Access rate | Access to medical care and no retirement benefits | Civilian workers | All industries | All occupations | All workers | All workers | 2025 | 8 | NaN | 0.6 | NaN | Percent of civilian workers with access to medical care and no retirement benefits | NBU10000000000000028008 | 1 | 0 | 000000 | 0 | 0 | 28 | 008
2 | Benefit combinations | Access rate | Access to retirement and no medical care benefits | Civilian workers | All industries | All occupations | All workers | All workers | 2025 | 8 | NaN | 0.6 | NaN | Percent of civilian workers with access to retirement and no medical care benefits | NBU10000000000000028009 | 1 | 0 | 000000 | 0 | 0 | 28 | 009
3 | Benefit combinations | Access rate | Access to no medical care and no retirement benefits | Civilian workers | All industries | All occupations | All workers | All workers | 2025 | 18 | NaN | 0.6 | NaN | Percent of civilian workers with access to no medical care and no retirement benefits | NBU10000000000000028010 | 1 | 0 | 000000 | 0 | 0 | 28 | 010
4 | Benefit combinations | Access rate | Access to both medical care benefits and life insurance plans | Civilian workers | All industries | All occupations | All workers | All workers | 2025 | 61 | NaN | 0.9 | NaN | Percent of civilian workers with access to both medical care benefits and life insurance plans | NBU10000000000000028011 | 1 | 0 | 000000 | 0 | 0 | 28 | 011
5 | Benefit combinations | Access rate | Access to medical care benefits and no life insurance plans | Civilian workers | All industries | All occupations | All workers | All workers | 2025 | 13 | NaN | 0.6 | NaN | Percent of civilian workers with access to medical care benefits and no life insurance plans | NBU10000000000000028012 | 1 | 0 | 000000 | 0 | 0 | 28 | 012
6 | Benefit combinations | Access rate | Access to life insurance plans and no medical care benefits | Civilian workers | All industries | All occupations | All workers | All workers | 2025 | 1 | NaN | 0.2 | NaN | Percent of civilian workers with access to life insurance plans and no medical care benefits | NBU10000000000000028013 | 1 | 0 | 000000 | 0 | 0 | 28 | 013
7 | Benefit combinations | Access rate | Access to no medical care benefits and no life insurance plans | Civilian workers | All industries | All occupations | All workers | All workers | 2025 | 25 | NaN | 0.8 | NaN | Percent of civilian workers with access to no medical care benefits and no life insurance plans | NBU10000000000000028014 | 1 | 0 | 000000 | 0 | 0 | 28 | 014
8 | Benefit combinations | Access rate | Access to both medical care benefits and defined benefit plans | Civilian workers | All industries | All occupations | All workers | All workers | 2025 | 23 | NaN | 0.5 | NaN | Percent of civilian workers with access to both medical care benefits and defined benefit plans | NBU10000000000000028015 | 1 | 0 | 000000 | 0 | 0 | 28 | 015
9 | Benefit combinations | Access rate | Access to defined benefit plans and no medical care benefits | Civilian workers | All industries | All occupations | All workers | All workers | 2025 | 1 | NaN | 0.1 | NaN | Percent of civilian workers with access to defined benefit plans and no medical care benefits | NBU10000000000000028016 | 1 | 0 | 000000 | 0 | 0 | 28 | 016

Last rows

Index | Estimate category | Datatype | Provision | Ownership | Industry | Occupation | Characteristic category | Characteristic | Year | Estimate | Estimate footnote | Standard error | Standard error footnote | Series title | Series ID | Ownership code | Estimate code | Industry code | Occupation code | Subcell code | Datatype code | Provision code
44312 | Retirement benefits | Participation rate | Participation rate for all retirement benefits | Local government workers | All industries | All occupations | All workers | All workers | 2025 | 80 | NaN | 0.6 | NaN | Percent of local government workers participating in all retirement benefits | NBU59000000000000026320 | 5 | 90 | 000000 | 0 | 0 | 26 | 320
44313 | Retirement benefits | Access rate | Access to all retirement benefits | Local government workers | All industries | All occupations | All workers | All workers | 2025 | 90 | NaN | 0.7 | NaN | Percent of local government workers with access to all retirement benefits | NBU59000000000000028319 | 5 | 90 | 000000 | 0 | 0 | 28 | 319
44314 | Retirement benefits | Access rate | Access to both defined benefit and defined contribution plans | Local government workers | All industries | All occupations | All workers | All workers | 2025 | 28 | NaN | 1.1 | NaN | Percent of local government workers with access to both defined benefit and defined contribution plans | NBU59000000000000028384 | 5 | 90 | 000000 | 0 | 0 | 28 | 384
44315 | Retirement benefits | Take-up rate | Take-up rate for defined benefit plans | Local government workers | All industries | All occupations | All workers | All workers | 2025 | 88 | NaN | 0.6 | NaN | Take-up rate for defined benefit plans for local government workers | NBU59000000000000032292 | 5 | 90 | 000000 | 0 | 0 | 32 | 292
44316 | Retirement benefits | Take-up rate | Take-up rate for defined contribution plans | Local government workers | All industries | All occupations | All workers | All workers | 2025 | 49 | NaN | 2.1 | NaN | Take-up rate for defined contribution plans for local government workers | NBU59000000000000032314 | 5 | 90 | 000000 | 0 | 0 | 32 | 314
44317 | Retirement benefits | Take-up rate | Take-up rate for all retirement benefits | Local government workers | All industries | All occupations | All workers | All workers | 2025 | 89 | NaN | 0.5 | NaN | Take-up rate for all retirement benefits for local government workers | NBU59000000000000032321 | 5 | 90 | 000000 | 0 | 0 | 32 | 321
44318 | Healthcare benefits | Participation rate | Participation rate for healthcare benefits | Local government workers | All industries | All occupations | All workers | All workers | 2025 | 75 | NaN | 0.7 | NaN | Percent of local government workers participating in healthcare benefits | NBU59400000000000026172 | 5 | 94 | 000000 | 0 | 0 | 26 | 172
44319 | Healthcare benefits | Take-up rate | Take-up rate for healthcare benefits | Local government workers | All industries | All occupations | All workers | All workers | 2025 | 86 | NaN | 0.6 | NaN | Take-up rate for healthcare benefits for local government workers | NBU59400000000000032173 | 5 | 94 | 000000 | 0 | 0 | 32 | 173
44320 | Financial benefits | Access rate | Access to student loan repayment | Local government workers | All industries | All occupations | All workers | All workers | 2025 | 5 | NaN | 0.4 | NaN | Percent of local government workers with access to student loan repayment | NBU59700000000000033967 | 5 | 97 | 000000 | 0 | 0 | 33 | 967
44321 | Quality of life benefits | Access rate | Access to flexible work schedule | Local government workers | All industries | All occupations | All workers | All workers | 2025 | 6 | NaN | 0.5 | NaN | Percent of local government workers with access to flexible work schedule | NBU59800000000000033968 | 5 | 98 | 000000 | 0 | 0 | 33 | 968